Skip to content

feat(ci): Add comprehensive performance tracing with metrics analysis#13

Merged
neil-pozetroninc merged 6 commits intomainfrom
feat/add-performance-tracing
Oct 25, 2025
Merged

feat(ci): Add comprehensive performance tracing with metrics analysis#13
neil-pozetroninc merged 6 commits intomainfrom
feat/add-performance-tracing

Conversation

@neil-pozetroninc
Copy link
Copy Markdown
Contributor

@neil-pozetroninc neil-pozetroninc commented Oct 24, 2025

Summary

Implements comprehensive performance monitoring and analysis for the CI pipeline, building on the initial CPU profiling foundation.

Key Features

  1. Real Webhook Workload Simulation

    • Sends actual webhook payloads (issues, pull requests) during trace collection
    • Simulates realistic bot behavior instead of idle waiting
    • Captures performance under real request processing load
  2. Automated Metrics Extraction

    • New scripts/analyze-trace.py parses speedscope traces
    • Extracts: CPU time, sample counts, top hotspot methods, GC activity
    • Outputs both JSON (for storage) and markdown (for display)
  3. Baseline Comparison

    • Stores baseline metrics from main branch via GitHub Actions cache
    • Compares PR performance against baseline
    • Highlights regressions and improvements with visual indicators
  4. Historical Performance Tracking

    • Saves metrics to perf-history/ directory
    • Tracks trends over time (last 100 runs)
    • Enables long-term performance analysis
  5. GitHub Actions Summary Integration

    • Displays performance reports directly in job summaries
    • Shows top 10 hotspot methods
    • Baseline comparison with delta metrics

Changes

  • .github/workflows/dotnet.yml: Added webhook simulation, analysis, and reporting steps
  • scripts/analyze-trace.py: New metrics extraction and analysis tool
  • Enhanced artifact uploads with metrics and historical data

Test Plan

  • Tested locally with gh act - all steps execute successfully
  • Metrics extraction generates valid JSON and markdown
  • Webhook workload simulation sends requests successfully
  • Baseline comparison logic works correctly (shows "No baseline" on first run)
  • CI verification pending (will run on push)

Benefits

  • Early detection of performance regressions
  • Actionable insights from hotspot analysis
  • Historical trends for long-term monitoring
  • Automated performance validation on every PR

- Add dotnet tools directory to PATH for subsequent steps
- Switch from --providers to --profile cpu-sampling for actual CPU profiling
- This generates flamegraphs with real stack traces instead of empty traces

Previous approach only captured runtime events (GC, JIT) without CPU samples.
The cpu-sampling profile includes Microsoft-DotNETCore-SampleProfiler which
captures stack traces at 1ms intervals, enabling proper performance analysis.

Tested locally with act - generates 22KB trace with actual profiling data.

Co-Authored-By: Claude <noreply@anthropic.com>
@neil-pozetroninc neil-pozetroninc force-pushed the feat/add-performance-tracing branch from 8655a1e to 84d833d Compare October 25, 2025 00:23
…rical tracking

- Add webhook workload simulation to trace real request processing
- Create analyze-trace.py script to extract performance metrics
- Implement baseline comparison against main branch via Actions cache
- Add historical performance tracking in perf-history/ directory
- Display performance reports in GitHub Actions summary
- Upload metrics artifacts with 30-day retention

This provides actionable performance insights and trend tracking
over time, enabling early detection of performance regressions.
@neil-pozetroninc neil-pozetroninc changed the title fix(ci): Enable CPU profiling in performance trace job feat(ci): Add comprehensive performance tracing with metrics analysis Oct 25, 2025
The analyze-trace.py script was only parsing the 'sampled' profile
format (samples/weights arrays), but dotnet-trace produces 'evented'
format (open/close events with timestamps). This caused the script to
extract 0 samples from valid trace files.

Changes:
- Add support for evented profile type with O/C event parsing
- Track stack state and calculate frame durations from events
- Fix deprecated datetime.utcnow() usage
- Maintain backward compatibility with sampled format

The evented format is parsed by tracking open/close events to
reconstruct the call stack and measure time each frame spent active.
This produces accurate hotspot metrics from dotnet-trace output.
Add a step to post performance metrics as a PR comment, making
them immediately visible on the PR page without requiring users
to click through to job details.

The comment includes:
- Full performance summary with metrics
- Link to detailed trace in Actions
- Link to downloadable flamegraph artifact

Only runs on pull requests to avoid errors on direct pushes.
@github-actions
Copy link
Copy Markdown

📊 Performance Trace Report

Performance Trace Analysis

Summary

  • Total Samples: 2,472,055.181392
  • Estimated CPU Time: 2472055.18 ms
  • Thread Count: 11
  • GC Activity: 0.00% of samples

Top 10 Hotspot Methods

Method Samples CPU Time %
Threads 267,959.82584321743 267959.8 ms 10.8%
(Non-Activities) 267,959.82584321743 267959.8 ms 10.8%
Process64 HelloWorldBot (2898) Args: /home/runner/work/ProbotSharp/ProbotShar... 267,959.82584321743 267959.8 ms 10.8%
UNMANAGED_CODE_TIME 267,947.36513750156 267947.4 ms 10.8%
System.Threading.PortableThreadPool+WorkerThread.WorkerThreadStart() 117,877.286417375 117877.3 ms 4.8%
System.Threading.LowLevelLifoSemaphore.Wait(int32,bool) 117,855.87668861219 117855.9 ms 4.8%
System.Threading.LowLevelLifoSemaphore.WaitForSignal(int32) 117,841.08166596551 117841.1 ms 4.8%
System.Threading.LowLevelLifoSemaphore.WaitNative(class Microsoft.Win32.SafeH... 117,838.82833386786 117838.8 ms 4.8%
System.Threading.Monitor.Wait(class System.Object,int32) 60,029.42675780505 60029.4 ms 2.4%
System.Threading.ManualResetEventSlim.Wait(int32,value class System.Threading... 60,029.42675780505 60029.4 ms 2.4%

View full trace | Download flamegraph

Significantly improved readability and usability of performance trace reports:

**Readability Improvements:**
- Human-readable time formats (41.6 min vs 2498937ms)
- Visual progress bars for percentages using block characters
- Clear verdict at top (✅/⚠️/🔴) based on baseline comparison
- Collapsible sections to reduce information density

**Balanced View:**
- Show verdict + key metrics prominently
- Filter application hotspots from framework noise
- Categorize methods as app/framework/noise with emoji badges
- Full technical details available in collapsible sections

**New Features:**
- format_duration(): Convert ms to readable format
- format_percentage_bar(): Create visual █░ progress bars
- categorize_method(): Detect ProbotSharp app code vs framework
- Improved baseline comparison with trend indicators (↑↓→)
- Contextual help: "Changes within normal variance" guidance

**User Experience:**
- Before: Dense technical table, unclear if performance is good/bad
- After: Clean verdict, visual bars, filtered hotspots, actionable insights

Tested with real trace files - output is much more scannable and actionable.
@github-actions
Copy link
Copy Markdown

📊 Performance Trace Report

📊 Performance Summary

Verdict: ✅ No significant performance impact detected
Trace Duration: 41.1 min
Test Workload: 40 webhook requests (20 issues, 20 pull requests)


Key Metrics

Metric Value Visual
Total CPU Time 41.1 min ████████████████████ 100.0%
Thread Count 11 threads
GC Activity 0.00% ✅ Minimal ░░░░░░░░░░░░░░░░░░░░ 0.0%

🔥 Application Hotspots

No significant application hotspots detected. Most time spent in framework/wait operations.

📋 Full Method Details (all categories)
Method Category CPU Time Percentage
Threads 🔇 Noise 4.5 min █░░░░░░░░░░░ 10.8%
(Non-Activities) 🔇 Noise 4.5 min █░░░░░░░░░░░ 10.8%
Process64 HelloWorldBot (3000) Args: /home/runner/work/Pr... 🔇 Noise 4.5 min █░░░░░░░░░░░ 10.8%
UNMANAGED_CODE_TIME 🔇 Noise 4.5 min █░░░░░░░░░░░ 10.8%
System.Threading.PortableThreadPool+WorkerThread.WorkerTh... 🔇 Noise 2.0 min ░░░░░░░░░░░░ 4.8%
System.Threading.LowLevelLifoSemaphore.Wait(int32,bool) 🔇 Noise 2.0 min ░░░░░░░░░░░░ 4.8%
System.Threading.LowLevelLifoSemaphore.WaitForSignal(int32) 🔇 Noise 2.0 min ░░░░░░░░░░░░ 4.8%
System.Threading.LowLevelLifoSemaphore.WaitNative(class M... 🔇 Noise 2.0 min ░░░░░░░░░░░░ 4.8%
System.Threading.Monitor.Wait(class System.Object,int32) 🔇 Noise 1.0 min ░░░░░░░░░░░░ 2.4%
System.Threading.ManualResetEventSlim.Wait(int32,value cl... 🔇 Noise 1.0 min ░░░░░░░░░░░░ 2.4%
🔍 Technical Details
  • Total Samples: 2,466,212
  • CPU Time (raw): 2466211.73 ms
  • Sample Rate: 1ms intervals
  • Trace Format: Speedscope evented
  • GC Samples: 4 (0.00%)
  • Timestamp: 2025-10-25T02:40:13.576251Z

View full trace | Download flamegraph

…nalysis

Fixed critical bug where script summed inclusive times instead of calculating
exclusive times, causing massive overcounting (30s trace showed as 41 minutes).

Changes:
- Implemented stack-based exclusive time calculation for evented format
- Track both inclusive (total) and exclusive (self) time per frame
- Calculate wall-clock duration as max(profile durations), not sum
- Total CPU time = sum of exclusive times across all frames
- Update all output fields to use new metric names

Metrics:
- wall_clock_ms: Actual elapsed time (endValue - startValue)
- total_cpu_time_ms: Sum of all exclusive times
- Methods show both inclusive_ms and exclusive_ms
- Percentages calculated against wall_clock, not summed times

Verification:
Before: 2,498,937 ms (41 minutes) - WRONG
After: 30,016 ms (30 seconds) - CORRECT

The fix follows industry best practices (Brendan Gregg, Speedscope) for
calculating exclusive time: exclusive = inclusive - children_time.
@github-actions
Copy link
Copy Markdown

📊 Performance Trace Report

📊 Performance Summary

Verdict: ✅ No significant performance impact detected
Trace Duration: 30.0 sec
Test Workload: 40 webhook requests (20 issues, 20 pull requests)


Key Metrics

Metric Value Visual
Total CPU Time 4.4 min ████████████████████ 100.0%
Thread Count 11 threads
GC Activity 0.00% ✅ Minimal ░░░░░░░░░░░░░░░░░░░░ 0.0%

🔥 Application Hotspots

No significant application hotspots detected. Most time spent in framework/wait operations.

📚 Framework Methods (click to expand)
Method CPU Time Percentage
CPU_TIME 10 ms ░░░░░░░░░░░░░░░ 0.0%
Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Infrastructure.He... 1 ms ░░░░░░░░░░░░░░░ 0.0%
System.Diagnostics.Tracing.EventSource.CreateManifestAndDescriptors... 0 ms ░░░░░░░░░░░░░░░ 0.0%
System.Threading.Tasks.Task.SpinThenBlockingWait(int32,value class ... 0 ms ░░░░░░░░░░░░░░░ 0.0%
📋 Full Method Details (all categories)
Method Category CPU Time Percentage
UNMANAGED_CODE_TIME 🔇 Noise 4.4 min ██████████████████████████████████████████████████████████████████████████████████████████████████████████ 887.2%
CPU_TIME 📦 Framework 10 ms ░░░░░░░░░░░░ 0.0%
Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Infrast... 📦 Framework 1 ms ░░░░░░░░░░░░ 0.0%
System.Threading.PortableThreadPool+WorkerThread.WorkerTh... 🔇 Noise 0 ms ░░░░░░░░░░░░ 0.0%
System.Threading.LowLevelLifoSemaphore.Wait(int32,bool) 🔇 Noise 0 ms ░░░░░░░░░░░░ 0.0%
System.Threading.PortableThreadPool+GateThread.GateThread... 🔇 Noise 0 ms ░░░░░░░░░░░░ 0.0%
System.Diagnostics.Tracing.EventSource.CreateManifestAndD... 📦 Framework 0 ms ░░░░░░░░░░░░ 0.0%
System.Threading.Monitor.Wait(class System.Object,int32) 🔇 Noise 0 ms ░░░░░░░░░░░░ 0.0%
System.Threading.ManualResetEventSlim.Wait(int32,value cl... 🔇 Noise 0 ms ░░░░░░░░░░░░ 0.0%
System.Threading.Tasks.Task.SpinThenBlockingWait(int32,va... 📦 Framework 0 ms ░░░░░░░░░░░░ 0.0%
🔍 Technical Details
  • Wall-Clock Duration: 30016.18 ms
  • Total CPU Time: 266305.30 ms
  • Sample Rate: 1ms intervals
  • Trace Format: Speedscope evented
  • GC Samples: 0 (0.00%)
  • Timestamp: 2025-10-25T04:34:31.913379Z

View full trace | Download flamegraph

@neil-pozetroninc neil-pozetroninc merged commit d8f330f into main Oct 25, 2025
7 checks passed
@neil-pozetroninc neil-pozetroninc deleted the feat/add-performance-tracing branch October 25, 2025 04:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant